Abstract

We consider the problem of self-healing in networks 
that are reconfigurable in the sense that they can change 
their topology during an attack. Our goal is to maintain 
connectivity in these networks, even in the presence of 
repeated adversarial node deletion, by carefully adding 
edges after each attack. We present a new algorithm, 
DASH, that provably ensures that: 1) the network stays
connected even if an adversary deletes up to all nodes 
in the network; and 2) no node ever increases its degree 
by more than 2 log n, where n is the number of nodes 
initially in the network. DASH is fully distributed; adds 
new edges only among neighbors of deleted nodes; and 
has average latency and bandwidth costs that are at 
most logarithmic in n. DASH has these properties irre- 
spective of the topology of the initial network, and is thus 
orthogonal and complementary to traditional topology- 
based approaches to defending against attack. 

We also prove lower bounds showing that DASH
is asymptotically optimal in terms of minimizing maxi- 
mum degree increase over multiple attacks. Finally, we 
present empirical results on power-law graphs that show 
that DASH performs well in practice, and that it signif- 
icantly outperforms naive algorithms in reducing maxi- 
mum degree increase. 



1. Introduction 

On August 15, 2007 the Skype network crashed for 
about 48 hours, disrupting service to approximately 200 
million users [8, 13, 16, 19, 20]. Skype attributed this 
outage to failures in their "self-healing mechanisms" [2]. 
We believe that this outage is indicative of a much 
broader problem. Modern computer systems have com- 
plexity unprecedented in the history of engineering: we 
are approaching scales of billions of components. Such 
systems are less akin to a traditional engineering enterprise
such as a bridge, and more akin to a living organism
in terms of complexity. A bridge must be designed
so that key components never fail, since there is no way
for the bridge to automatically recover from system failure.
In contrast, a living organism cannot be designed
so that no component ever fails: there are simply too
many components. For example, skin can be cut and still 
heal. Designing skin that can heal is much more practi- 
cal than designing skin that is completely impervious to 
attack. Unfortunately, current algorithms ensure robust- 
ness in computer networks through hardening individual 
components or, at best, adding lots of redundant compo- 
nents. Such an approach is increasingly unscalable. 

In this paper, we focus on a new, responsive approach 
for maintaining robust networks. Our approach is re- 
sponsive in the sense that it responds to an attack (or 
component failure) by changing the topology of the net- 
work. Our approach works irrespective of the initial 
state of the network, and is thus orthogonal and comple- 
mentary to traditional non-responsive techniques. There 
are many desirable invariants to maintain in the face of 
an attack. Here we focus only on one of the simplest 
and most fundamental invariants: maintaining network 
connectivity. 

The responsive approach will only work on networks 
that are reconfigurable, in the sense that the topology 
of the network can be changed. Not all networks have 
this property. However, many large-scale networks are 
reconfigurable. For example, peer-to-peer and overlay 
networks are reconfigurable, as are wireless and mo- 
bile networks. More generally, many social networks, 
such as a company's organizational chart; infrastructure 
networks, such as an airline's transportation network; 
and biological networks, such as the human brain, are 
also reconfigurable. The increasing importance of these 
types of networks calls for new mathematical and algo- 
rithmic methods to study and exploit their flexibility. 

Our Model: We now describe our model of attack and 
network response. We assume that the network is ini- 
tially a connected graph over n nodes. We assume that 
every node knows not only its neighbors in the network 
but also the neighbors of its neighbors, i.e.
neighbor-of-neighbor (NoN) information. In particular, for all
nodes x, y and z such that x is a neighbor of y and y
is a neighbor of z, x knows z. There are many ways
that such information can be efficiently maintained, see 
e.g. [14, 18]. 
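
The neighbor-of-neighbor assumption can be illustrated with a minimal sketch. It assumes the network is held as a dictionary mapping each node to the set of its neighbors; the function name `neighbor_of_neighbor` is hypothetical and is not part of the paper or of the maintenance schemes cited above.

```python
def neighbor_of_neighbor(adj):
    """For each node x, collect every node z that some neighbor y of x
    is adjacent to (x itself is excluded)."""
    non = {}
    for x, nbrs in adj.items():
        two_hop = set()
        for y in nbrs:
            two_hop |= adj[y]      # everything y can see
        two_hop.discard(x)         # x is trivially a neighbor of its neighbors
        non[x] = two_hop
    return non


# A path 1 - 2 - 3 - 4: node 1 learns about node 3 through node 2.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
non_table = neighbor_of_neighbor(path)
```

This recomputes the table from scratch; a deployed system would update it incrementally as edges change.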

We assume that there is an adversary that is attacking 
the network. This adversary knows the network topol- 
ogy and our algorithm, and it has the ability to delete 
carefully selected nodes from the network. However, we 
assume the adversary is constrained in that in any time
step it can only delete a small number of nodes from the
network.¹ We further assume that after the adversary
deletes some node x from the network, the neighbors
of x become aware of this deletion and
have a small amount of time to react.

When a node x is deleted, we allow the neighbors of 
x to react to this deletion by adding some set of edges 
amongst themselves. We assume that these edges can 
only be between nodes which were previously neighbors 
of x. This is to ensure that, as much as possible, edges 
are added which respect locality information in the un- 
derlying network. We assume that there is very limited 
time to react to deletion of x before the adversary deletes 
another node. Thus, the algorithm for deciding which 
edges to add between the neighbors of x must be fast 
and localized. 

Our Results: We introduce an algorithm for self-healing
of reconfigurable networks, called DASH (an
acronym for Degree-based Self-Healing). DASH is
locality-aware in that it uses only the neighbors of
the deleted node for reconnection. We prove that
DASH maintains connectivity in the network, and that
it increases the degree of any node by no more than
O(log n). During reconnection of nodes, our algorithm
uses only local information; it is therefore scalable and
can be implemented in a completely distributed manner.
DASH is described as Algorithm 1 in
Section 2. The main characteristics of DASH are
summarized in the following theorem, which is proved in
Section 2.

Theorem 1. DASH guarantees the following properties
even if up to all the nodes in the network are deleted:

• The degree of any vertex is increased by at most
2 log n.

• The number of messages any node of initial degree d
sends out and receives is no more than
2(d + 2 log n) ln n with high probability² over all
node deletions.

• The latency to reconnect is O(1) after attack; and
the amortized latency to update the state of the
network over Θ(n) deletions is O(log n) with high
probability.

¹ Throughout this paper, for ease of exposition, we will assume that
the adversary deletes only one node from the network before the
algorithm responds. However, our main algorithm, DASH, can easily
handle the situation where any number of nodes are removed, so long
as the neighbor-of-neighbor graph remains connected.

² Throughout this paper, we use the phrase with high probability
(w.h.p.) to mean with probability at least 1 − 1/n^C for any fixed
constant C.

We also prove (in Section 3) the following lower bound 
that shows that DASH is asymptotically optimal. 

Theorem 2. Consider any locality-aware algorithm 
that increases the degree of any node after an attack by 
at most a fixed constant. Then there exists a graph and a 
strategy of deletions on that graph that will force the al- 
gorithm to increase the degree of some node by at least 
log n. 

We also present empirical results (in Section 4) show- 
ing that DASH performs well in practice and that it sig- 
nificantly outperforms naive algorithms in terms of re- 
ducing the maximum degree increase. Finally (in Sec- 
tion 4) we describe SDASH, a heuristic based on DASH 
that we show empirically both keeps node degrees small 
and also keeps shortest paths between nodes short. 

Related Work: There have been numerous papers that 
discuss strategies for adding additional capacity and 
rerouting in anticipation of failures [7, 9, 12, 17, 21, 22]. 
Here we focus on results that are responsive in some 
sense. Medard, Finn, Barry, and Gallager [15] propose 
constructing redundant trees to make backup routes pos- 
sible when an edge or node is deleted. Anderson, Bal- 
akrishnan, Kaashoek, and Morris [1] modify some exist- 
ing nodes to be RON (Resilient Overlay Network) nodes 
to detect failures and reroute accordingly. Some net- 
works have enough redundancy built in so that separate 
parts of the network can function on their own in case 
of an attack [11]. In all these past results, the network 
topology is fixed. In contrast, our algorithm adds edges 
to the network as node failures occur. Further, our al- 
gorithm does not dictate routing paths or specifically re- 
quire redundant components to be placed in the network 
initially. In this paper, we build on earlier work done in 
[5, 6], which proposed a simple line algorithm for self- 
healing to maintain network connectivity. 

Table of Contents: The rest of our paper is organized as 
follows. Section 2 describes the algorithm DASH, and 
its theoretical properties. Section 3 gives a lower bound 
on locality-aware algorithms. Section 4 gives empirical 
results for DASH, and several other simple algorithms 
on random power-law networks. It also describes and 
gives results for SDASH. We conclude and give areas 
for future work in Section 5. 



2. DASH: An Algorithm for Self-Healing 

In this Section, we describe DASH and prove cer- 
tain properties about it. In brief, when a deletion occurs, 
DASH asks the neighbors of the deleted node to recon- 
nect themselves into a certain kind of complete binary 
tree. Then messages are propagated so that the nodes 
can keep track of which connected component they be- 
long to. 

Let the actual network at a particular time step be
G(V, E). Let E' be the edges (i.e. healing edges) that
have been added by the algorithm up to that time step
(note E' ⊆ E). Let G' = (V, E'). We show that G' is a
forest in Lemma 1.

2.1. DASH: Degree based Self-Healing 



Algorithm 1 DASH: Degree-Based Self-Healing
1: Init: for given network G(V, E), initialize each vertex
   with a random number ID in [0, 1] selected
   uniformly at random.
2: while true do
3:   If a vertex v is deleted, do
4:     Nodes in UN(v, G) ∪ N(v, G') are reconnected
       into a complete binary tree. To connect the tree,
       go left to right, top down, mapping nodes to
       the complete binary tree in increasing order of δ
       value.
5:     Let MINID be the minimum ID of any node
       in UN(v, G) ∪ N(v, G'). Propagate MINID to
       all the nodes in the tree of UN(v, G) ∪ N(v, G')
       in G'. All these nodes now set their ID to
       MINID.
6: end while



As the acronym suggests, DASH employs informa- 
tion of previous degree increase to control further degree 
increase for a node. When a deletion occurs, we assume 
the neighbors of the deleted node are able to detect the 
deletion. Then they employ DASH to heal. To maintain 
connectivity, DASH connects the neighbors of a deleted 
node as a binary tree. The tree is structured so that the 
vertices which have incurred the maximum degree in- 
crease previously get to be leaves and thus not increase 
their degree in this round. Notice that at least half the 
vertices in a binary tree are leaves. The nodes main- 
tain information about the virtual network and their con- 
nected component in this network. The algorithm tries 
to use only a single node from each component during 
reconnection and thus adds only a low number of new 
edges during healing. 

To describe DASH we give some definitions. Let
N(v, G) be the neighbors of vertex v in the graph G
representing the real network. Let N(v, G') be the neighbors
of vertex v in the graph G' consisting of the edges
added by the healing algorithm. Let δ(v) be the degree
increase of the vertex v compared to its initial degree.
Note that this is not the same as the degree of v in G'.

When a node v is deleted, partition on the basis of 
their ID all the neighbors of v in G (not having the same 
ID as v). Let UN(v, G) (Unique Neighbors) be the set 
having one representative from each of the partitions. If 
there is more than one node as a possible representative 
from a partition, we include the one with the lowest ini- 
tial ID. 

Note that UN(v, G) ∩ N(v, G') = ∅ and UN(v, G) ∪
N(v, G') ⊆ N(v, G). The ID of a node allows us to
keep track of which connected component in G' it
belongs to. The lowest ID of any node in that component
is broadcast and all the nodes in the component take on
this ID.
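
One healing round, as described by steps 4 and 5 of Algorithm 1 and the definitions above, might be sketched as follows. This is a centralized illustration under stated assumptions, not the authors' distributed implementation: the dictionaries `adj` (edges of G), `heal` (edges of G'), `comp_id` (current IDs), `init_id` (initial IDs) and `init_deg` (initial degrees), the function name `dash_heal`, and the tie-break on initial ID are all hypothetical choices made for the example.

```python
from collections import deque


def dash_heal(adj, heal, comp_id, init_id, init_deg, v):
    """Delete v, then reconnect UN(v, G) and N(v, G') as a complete binary tree."""
    nbrs_g = adj.pop(v)       # N(v, G)
    nbrs_h = heal.pop(v)      # N(v, G')
    v_id = comp_id.pop(v)
    for u in nbrs_g:
        adj[u].discard(v)
    for u in nbrs_h:
        heal[u].discard(v)

    # UN(v, G): one representative per foreign ID, lowest initial ID wins.
    reps = {}
    for u in nbrs_g:
        cid = comp_id[u]
        if cid == v_id:
            continue
        if cid not in reps or init_id[u] < init_id[reps[cid]]:
            reps[cid] = u
    members = list(reps.values()) + list(nbrs_h)
    if not members:
        return []

    # Increasing order of degree increase, so the most loaded nodes become leaves.
    # (Ties are broken by initial ID; this tie-break is an assumption.)
    members.sort(key=lambda u: (len(adj[u]) - init_deg[u], init_id[u]))

    # Heap-style layout: position i is the child of position (i - 1) // 2.
    new_edges = []
    for i in range(1, len(members)):
        parent, child = members[(i - 1) // 2], members[i]
        adj[parent].add(child)
        adj[child].add(parent)
        heal[parent].add(child)
        heal[child].add(parent)
        new_edges.append((parent, child))

    # Propagate MINID through the merged tree of G'.
    min_id = min(comp_id[u] for u in members)
    queue, seen = deque([members[0]]), {members[0]}
    while queue:
        u = queue.popleft()
        comp_id[u] = min_id
        for w in heal[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return new_edges


# Star with center 0: deleting the center reconnects the four leaves.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
star_heal = {u: set() for u in star}
star_init_id = {0: 0.5, 1: 0.9, 2: 0.1, 3: 0.7, 4: 0.3}
star_comp_id = dict(star_init_id)
star_init_deg = {u: len(star[u]) for u in star}
star_edges = dash_heal(star, star_heal, star_comp_id, star_init_id, star_init_deg, 0)
```

In the star example two of the four leaves end up with a degree increase of 1 and the other two with none, which matches the observation that at least half the vertices of a binary tree are leaves.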



Our main results about DASH are stated in Theo- 
rem 1. 

Theorem 1. DASH is a distributed algorithm with the
following properties:

• The degree of any vertex is increased by at most
2 log n.

• The latency to reconnect is O(1).

• The number of messages any node of initial degree d
sends out and receives is no more than 2(d + 2 log n) ln n
with high probability over all node deletions.

• The amortized latency for ID propagation is
O(log n) with high probability over all node
deletions.

2.2. Proof of Theorem 1 

For analysis, we use the following definitions:

• Let T(x, y) be the tree in G' − y that contains x.

• Each vertex v will have a weight, w(v). The weight
of a vertex will start at 1 and may increase during
the algorithm. If v is deleted, w(v) is added to an
arbitrarily chosen neighbor in G'.

• Let $W(S) = \sum_{v \in V} w(v)$ for a graph S(V, E), i.e.
the sum of the weights of all vertices in S.

• For vertex v, let
$\mathrm{rem}(v) = \sum_{u \in N(v,G')} W(T(u,v)) - \max_{u \in N(v,G')} W(T(u,v)) + w(v)$.



We will show that as the degree of a vertex in- 
creases in our algorithm, so will the rem value of 
that vertex. Intuitively rem(v) is large when re- 
moving v from its tree in G' gives rise to many 
connected components with large weight. 
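
To make the definition of rem concrete, the following sketch computes W(T(u, v)) and rem(v) on a small forest. It assumes G' is stored as a dictionary `heal` of neighbor sets and weights as a dictionary `w`; both names, and the convention that an isolated vertex has rem(v) = w(v), are assumptions made for the example (the latter agrees with the base case used later, where rem(v) = 1).

```python
def subtree_weight(heal, w, u, v):
    """W(T(u, v)): total weight of the tree containing u in G' - v."""
    total, stack, seen = 0, [u], {u, v}   # v is marked seen so it is never crossed
    while stack:
        x = stack.pop()
        total += w[x]
        for y in heal[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return total


def rem(heal, w, v):
    """Sum of the subtree weights around v, minus the largest, plus w(v)."""
    parts = [subtree_weight(heal, w, u, v) for u in heal[v]]
    if not parts:
        return w[v]
    return sum(parts) - max(parts) + w[v]


# Path a - b - c with unit weights.
path_heal = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
unit = {"a": 1, "b": 1, "c": 1}
rem_b = rem(path_heal, unit, "b")   # 1 + 1 - 1 + 1
rem_a = rem(path_heal, unit, "a")   # 2 - 2 + 1
```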

Lemma 1. The edges added by the algorithm, E', form
a forest.

Proof. We prove this by induction on the number of
nodes deleted.

Base Case: Initially, G' is a forest because E' is empty.

Inductive Step: We note that E' and G' change only when a
deletion occurs. Consider the i-th deletion and let v be the
node deleted. By the inductive hypothesis, G' is a forest
just prior to this deletion.

Let v belong to tree T_v in G' just prior to the deletion
of v. Now, for all x, y ∈ N(v, G'), x and y are not connected
in E' except through v, since otherwise there would be
a cycle through v, contradicting the inductive hypothesis.
Note also that for all z ∈ UN(v, G), z ∉ T_v. Since
we select only one node from each tree T_j in which v had
a neighbor, no pair of nodes in UN(v, G) ∪ N(v, G')
is connected in G' once v is removed. We reconnect all
the nodes in UN(v, G) ∪ N(v, G') in a binary tree and
propagate the minimum ID. Since we are adding edges
between nodes which previously were in separate connected
components in G', no cycles are introduced. Hence, G'
remains a forest.

□

Lemma 2. For any vertex v, rem(v) is non-decreasing
over any vertex deletion where v has not been deleted.

Proof. By Lemma 1, every vertex v in G' belongs to
some tree, which we will call T_v. For every T_v in G',
W(T_v) is the sum of the weights of all vertices in T_v.
By definition,

$\mathrm{rem}(v) = \sum_{u \in N(v,G')} W(T(u,v)) - \max_{u \in N(v,G')} W(T(u,v)) + w(v)$.

Since T_v consists of v together with the subtrees T(u, v)
for u ∈ N(v, G'), we have
$W(T_v) = \sum_{u \in N(v,G')} W(T(u,v)) + w(v)$. Therefore,

$\mathrm{rem}(v) = W(T_v) - \max_{u \in N(v,G')} W(T(u,v))$.

Observe first that W(T_v) cannot decrease even when
there is a deletion in T_v because the deleted vertex's
weight is not "lost", but added to some member of T_v.

Since W(T_v) cannot decrease, rem(v) can only decrease
if the maximum subtree weight increases more
than W(T_v). Since the maximum subtree is a subset of
the tree T_v, any increase or decrease in the maximum
subtree is also counted in W(T_v). Thus, rem(v) cannot
decrease.

□

Lemma 3. For any node v, for all nodes q ∈ N(v, G'),
W(T(v, q)) ≥ rem(v).

Figure 1. W(T(v, m)) ≥ rem(v).

Proof. For all nodes q ∈ N(v, G'),

$W(T(v,q)) = \sum_{u \in N(v,G'),\, u \neq q} W(T(u,v)) + w(v)$
$\ge \sum_{u \in N(v,G')} W(T(u,v)) - \max_{u \in N(v,G')} W(T(u,v)) + w(v)$
$= \mathrm{rem}(v)$.

For example, in figure 1, W(T(v, m)) =
W(T(l, v)) + W(T(r, v)) + w(v) ≥ rem(v). □

Lemma 4. For any node v, $\mathrm{rem}(v) \ge 2^{\delta(v)/2}$, where
δ(v), as defined earlier, is the degree increase of the
vertex v in G.

Proof. Let t be the number of rounds of healing, where
a round is a single adversarial deletion followed by
self-healing by DASH. We prove this lemma by
induction on t.

Let $G'_t$, $\mathrm{rem}_t(v)$ and $\delta_t(v)$ be G', rem(v) and δ(v)
respectively at time t.

Base Case: t = 0: In this case, all nodes v have
δ(v) = 0 and rem(v) = 1. Thus, $\mathrm{rem}(v) \ge 2^0$.

Inductive Step: Consider the network at round t. We
assume by the inductive hypothesis that for all nodes v
in G', $\mathrm{rem}_{t-1}(v) \ge 2^{\delta_{t-1}(v)/2}$. Our goal is to show that
$\mathrm{rem}_t(v) \ge 2^{\delta_t(v)/2}$.

Suppose node x was deleted at round t. According 
to our algorithm, some or all of the neighbors of x will 
be reconnected as a binary tree. Let us call this tree RT 
(short for Reconstruction Tree). Let T(x, y) be the tree
in $G'_{t-1} - y$ that contains x, and T'(x, y) be the tree in
$G'_t - y$ that contains x.

Consider a surviving vertex v. If v is not a part of RT, 
then by a simple application of lemma 2, our induction 
holds. If v is a part of RT, there are 3 possibilities: 



1. v is a leaf node in RT

The degree of v did not change. Thus, $\delta_t(v) = \delta_{t-1}(v)$.
By Lemma 2, $\mathrm{rem}_t(v) \ge \mathrm{rem}_{t-1}(v)$.
Thus, using the induction hypothesis,
$\mathrm{rem}_t(v) \ge 2^{\delta_t(v)/2}$.

2. v is the root of RT

Figure 2. Node v is the root, with 2 children

If v has only one child in RT, then this is the same
as the previous case with the parent and child roles
reversed and the induction holds. Let us consider
the case when v has two children in RT. Now,
$\delta_t(v)$ has increased by 1. Let z be the neighbor
of v such that W(T(z, v)) is the largest among all
neighbors of v except x. Note that W(T'(z, v)) =
W(T(z, v)), since this subtree was not involved in
the reconstruction. Consider the possibly empty
subtree of v rooted at z. Let the two children of
v in RT be $w_1$ and $w_2$, as illustrated in figure
2. By our algorithm, we know that
$\delta_{t-1}(w_1) \ge \delta_{t-1}(v)$ and $\delta_{t-1}(w_2) \ge \delta_{t-1}(v)$.
Thus, using the inductive hypothesis and Lemma 3, we have that
$W(T(w_1,x)) \ge \mathrm{rem}_{t-1}(w_1) \ge 2^{\delta_{t-1}(w_1)/2}$ and
$W(T(w_2,x)) \ge \mathrm{rem}_{t-1}(w_2) \ge 2^{\delta_{t-1}(w_2)/2}$. By
Lemma 2, this implies that in $G'_t$,

$W(T'(w_1,v)) \ge 2^{\delta_{t-1}(w_1)/2} \ge 2^{\delta_{t-1}(v)/2}$
$W(T'(w_2,v)) \ge 2^{\delta_{t-1}(w_2)/2} \ge 2^{\delta_{t-1}(v)/2}$

Assume without loss of generality that
$W(T'(w_1,v)) \le W(T'(w_2,v))$. There are
two cases:

(a) $W(T(z,v)) \le W(T'(w_1,v))$

In this case $\mathrm{rem}_{t-1}(v)$ did not include
W(T(x, v)). But $\mathrm{rem}_t(v)$ will include
$W(T'(w_1,v))$. Hence,

$\mathrm{rem}_t(v) \ge \mathrm{rem}_{t-1}(v) + W(T'(w_1,v))$
$\ge 2^{\delta_{t-1}(v)/2} + 2^{\delta_{t-1}(v)/2}$
$= 2^{(\delta_t(v)+1)/2}$

(b) $W(T(z,v)) > W(T'(w_1,v))$

In this case $\mathrm{rem}_t(v)$ will include
$W(T'(w_1,v))$ and the smaller of
$W(T'(w_2,v))$ and $W(T'(z,v))$. Note that by
Lemmas 3 and 2, the inductive hypothesis,
and the fact that $\delta_{t-1}(w_1) \ge \delta_{t-1}(v)$,

$W(T'(w_1,v)) \ge \mathrm{rem}_t(w_1) \ge \mathrm{rem}_{t-1}(w_1) \ge 2^{\delta_{t-1}(w_1)/2} \ge 2^{\delta_{t-1}(v)/2}$.

Also, since by assumption $W(T'(w_2,v)) \ge W(T'(w_1,v))$,
we know that $W(T'(w_2,v)) \ge 2^{\delta_{t-1}(v)/2}$.
Further, since $W(T'(z,v)) = W(T(z,v)) > W(T'(w_1,v))$,
we know that $W(T'(z,v)) \ge 2^{\delta_{t-1}(v)/2}$.

Hence,

$\mathrm{rem}_t(v) \ge 2^{\delta_{t-1}(v)/2} + 2^{\delta_{t-1}(v)/2}$
$= 2^{(\delta_{t-1}(v)+2)/2}$
$= 2^{(\delta_t(v)+1)/2}$

3. v is an internal node in RT

Figure 3. Internal node v with 1 child

Figure 4. Internal node v with 2 children

For node v to become an internal node, the deleted
neighbor x must have at least three other neighbors.
Three neighbors of x are shown as C1, C2 and P
in figures 3 and 4. Also, now v's degree can
increase by 1, as illustrated in figure 3, or by 2, as
illustrated in figure 4. Let us consider these cases
separately:

(a) $\delta_t(v) = \delta_{t-1}(v) + 1$

This can only happen when v has a parent and
a single child in RT as in figure 3. Let P be
the parent of v and C1 the child of v. C1 has
to be a leaf node since the tree is complete
and v has only one child. Observe that there
exists at least one leaf node besides C1 in the
tree, accessible to v only via P. Let this node
be C2 and let P2 be its parent. Note that P2
and P may even be the same node. In our
algorithm, any leaf node in RT has a δ value
no less than the δ value of any internal node.
Thus,

$\delta_{t-1}(C1) \ge \delta_{t-1}(v)$; and
$\delta_{t-1}(C2) \ge \delta_{t-1}(v)$.

These inequalities, Lemmas 2 and 3, and the
Inductive Hypothesis, imply that

$W(T'(C1,v)) \ge \mathrm{rem}_t(C1) \ge \mathrm{rem}_{t-1}(C1) \ge 2^{\delta_{t-1}(v)/2}$
$W(T'(C2,P2)) \ge \mathrm{rem}_t(C2) \ge \mathrm{rem}_{t-1}(C2) \ge 2^{\delta_{t-1}(v)/2}$
$W(T(v,x)) \ge \mathrm{rem}_t(v) \ge \mathrm{rem}_{t-1}(v) \ge 2^{\delta_{t-1}(v)/2}$

Since $\mathrm{rem}_t(v)$ can exclude at most one
of $W(T'(C1,v))$, $W(T'(C2,P2))$ and
$W(T(v,x))$,

$\mathrm{rem}_t(v) \ge 2^{\delta_{t-1}(v)/2} + 2^{\delta_{t-1}(v)/2}$
$= 2^{(\delta_t(v)+1)/2}$

(b) $\delta_t(v) = \delta_{t-1}(v) + 2$

In this case v has two children in RT, C1 and
C2, as illustrated in figure 4. The analysis is
similar to the case above. The value $\mathrm{rem}_t(v)$
can exclude at most one of $W(T'(C1,v))$,
$W(T'(C2,v))$ and $W(T(v,x))$ and we can
show that all three of these values are at least
$2^{\delta_{t-1}(v)/2}$. Thus, $\mathrm{rem}_t(v) \ge 2^{\delta_t(v)/2}$.

Hence, the induction holds.

□ 

Lemma 5. For all vertices v, rem(v) is always no more 
than n. 

Proof. No vertex is counted twice in a rem value since
the subtrees of a vertex are disjoint. Since the number of
vertices in the subtrees cannot be more than the number
of vertices remaining, the rem value is always no more
than the sum of the weights of all undeleted vertices in
G'.

Define W* to be the sum of weights of all undeleted
vertices in G'. After initialization, W* = n, since there
are n vertices. At each step of the algorithm, W* = n,
since the weight of the deleted vertex is added to one of
the remaining vertices. Thus, for node v, rem(v) ≤ n.

□ 

Lemma 6. DASH increases the degree of any vertex by
at most O(log n).

Proof. Every vertex v starts with rem(v) = w(v) = 1.
We know that $\mathrm{rem}(v) \ge 2^{\delta(v)/2}$ by Lemma 4. Since
rem(v) is at most n, $2^{\delta(v)/2} \le n$. Taking log of both
sides, δ(v)/2 ≤ log n. Solving for δ(v) gives δ(v) ≤
2 log n.

□ 
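
As a worked instance of this bound, the derivation can be written out with logarithms taken to base 2 (which the power-of-two form of Lemma 4 suggests) and an illustrative network size n = 1024 that is not taken from the paper:

```latex
2^{\delta(v)/2} \;\le\; \mathrm{rem}(v) \;\le\; n
\quad\Longrightarrow\quad
\delta(v) \;\le\; 2\log_2 n ,
\qquad\text{so for } n = 1024:\quad
\delta(v) \;\le\; 2 \cdot 10 = 20 .
```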

Lemma 7. The latency to reconnect the network in
DASH is O(1).

Proof. During the reconnection process, DASH requires
communication only between nodes one hop
away; thus, the latency is just O(1). □

Lemma 8. The number of messages any node of initial
degree d sends out and receives is no more than
2(d + 2 log n) ln n with high probability over all node
deletions.

Proof. In DASH, after the reconnections have been
made, messages are sent out by nodes when the minimum
ID has to be propagated. By analogy with the
record-breaking problem [10], it is easily shown that,
w.h.p., a node has its ID reduced no more than 2 ln n
times, where the record is the node's ID. These are
the only messages the node needs to transmit or receive.
Each time its ID changes, the node sends this
message to all its neighbors. Thus, it sends or receives
O((d + log n) ln n) messages, since the final degree of
the node is at most d + 2 log n.

□ 
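
The record-breaking argument can be explored numerically with a small sketch. It assumes a simplified model in which a node observes a sequence of n independent uniform IDs and lowers its own ID whenever a smaller one arrives; this model and the helper names are assumptions for illustration, not the paper's analysis. Under it, the expected number of reductions is H_n − 1, where H_n is the n-th harmonic number, which is close to ln n.

```python
import random


def count_id_reductions(ids):
    """Number of times the running minimum of `ids` strictly decreases."""
    current, drops = ids[0], 0
    for x in ids[1:]:
        if x < current:
            current, drops = x, drops + 1
    return drops


def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / k for k in range(1, n + 1))


def average_reductions(n, trials, seed=0):
    """Average number of ID reductions over `trials` random sequences."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += count_id_reductions([rng.random() for _ in range(n)])
    return total / trials


example_drops = count_id_reductions([0.5, 0.7, 0.3, 0.4, 0.1])  # drops at 0.3 and 0.1
```

Comparing `average_reductions(n, trials)` with `harmonic(n) - 1` for a few values of n gives a feel for why a bound of 2 ln n holds with high probability.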

Lemma 9. The amortized latency for ID propagation is
O(log n) with high probability over all node deletions.

Proof. Again, by analogy with the record-breaking
problem, a node sends messages to its neighbors
(neighbors, by definition, are a single hop away) only O(log n)
times with high probability. Thus, messages are
transmitted O(n log n) times over all the nodes. Over O(n)
deletions, this implies that the amortized latency for
messages (involving ID propagation) is only O(log n).
□ 

2.3. Proof of Theorem 1 

The proof of Theorem 1 now follows immediately 
from Lemmas 6, 7, 8 and 9. 



3. Lower bounds on Locality-aware algorithms

3.1. Necessity of Component tracking for 
healing strategies 

To begin with, we give an insight as to why a healing 
strategy might need to keep track of connected compo- 
nents. 

Lemma 10. For a tree, deletion of a node of degree d
increases the sum total of degrees of its neighbors by
d − 2 for a locality-aware acyclic healing strategy.

Proof. A locality-aware acyclic healing strategy will
reconnect the neighbors of a deleted node without
creating any cycles. If there were no cycles in the
original graph involving the neighbors and not involving the
deleted node, then such a strategy can only reconnect
these neighbors as a tree to maintain their connectivity.

A node of degree d has d neighbors. Since it was
part of a tree, this node and its neighbors also constitute
a tree. Let us call this the immediate subtree. The
immediate subtree had d edges and a total of 2d degrees.
These d neighbors are now reconnected as a tree with
d − 1 edges and 2(d − 1) degrees. Each of these
neighbors lost a single degree due to the deletion of their edge
to the deleted node. Thus, the total degrees gained on
reconstruction are 2(d − 1) − d = d − 2.

□ 



3.2. A lower bound on healing by
Degree-bounded locality-aware
healing algorithms

We now prove our result regarding the lower bounds 
for locality-aware algorithms in Theorem 2. Our lower 
bound occurs on graphs that are originally trees. To state 
the proof, we need to prove some other lemmas. 

First, we define the following operation that the ad- 
versary can perform on trees, where we assume self- 
healing is applied after every deletion: 

Prune(r, s): For a node r and its subtree headed by
node s, the Prune operation on s leads to deletion
of all the nodes in that subtree including s. This
operation can be accomplished by repeatedly deleting
leaf nodes in the subtree till all the nodes including
s are deleted.

Figure 5. Steps in Prune(v, x). Leaf nodes
are deleted at each step.



It is reasonable to assume that an efficient healing al- 
gorithm adds close to the minimum possible edges at 
each step to maintain connectivity of the neighbors of 
the deleted node. In G', if a deleted node v had two 
neighbors which had an alternate path between them- 
selves not involving v, then the algorithm may need to 
use only one of them for reconnection to other nodes. 
By extension, if there were many neighbors which had 
alternate connections between them, the algorithm may 
need to use only one of these nodes. This is equiva- 
lent to stating that the algorithm may need to use only 
one node from a connected component. Knowing that 
certain nodes are in the same component would allow 
the algorithm to do this. G' is comprised only of edges 
added by the healing algorithm, and is always a forest. If
the adversary mainly deletes nodes with degree greater
than 2 and the algorithm does not use the component
information, the sum total of degrees of the neighbors of
the deleted nodes will increase by d − 2, i.e. at least
1, at each step. After many (O(n)) deletions, only a few
nodes will be left, and these will have O(n) degree
increase.



Lemma 11. Deletion of a node with degree at least 3
increases the degree of at least one node by at least 1, no
matter how the healing occurs.

Proof. Any reconnection of more than two nodes has
a 3-node line (as in figure 6) as a subgraph. Here the
internal node has a degree increase of 1. Thus, at least
one node increases its degree by at least 1.

Figure 6. An internal node in a 3-node line
reconnection suffers a degree increase.

□

For further discussion, we define the following: 



Degree-bounded / M-degree-bounded : A healing al- 
gorithm is degree-bounded or M-degree-bounded if 
any node can increase its degree by at most M in a 
single round of deletion and healing. 

Lemma 12. Consider an M-degree-bounded locality-aware
healing algorithm used on a tree. In such a situation,
deletion of a node v with degree at least M + 3 leads
to a degree increase for at least two neighbors of v.

Proof. Node v has M + 3 neighbors. By Lemma 10, the
sum total of degree increase of neighbors is M + 1, when
the graph is a tree. Since one node can get a maximum
degree increase of M, at least one other node has to incur
the rest of the degree increase. Thus, at least two nodes
have to increase their degrees.

□

Figure 7. (M+2)-ary Tree



Algorithm 2 LevelAttack: level-by-level attack on a
(M+2)-ary tree
1: Consider an (M+2)-ary tree T of depth D with levels
   numbered 0 to D, the root being at level 0.
2: i ← D − 1
3: while i ≥ 0 do
4:   for each node v at level i do
5:     if v has c > M + 2 children, remove the excess
       c − (M + 2) nodes by deleting those with least
       degree increases and their subtrees by using the
       Prune operation, so that v now has M + 2
       children.
6:     delete v.
7:   end for
8:   i ← i − 1
9: end while



Here, we introduce a new attack strategy:

LevelAttack: This strategy is described in
Algorithm 2. In brief, the adversary deletes nodes one
level at a time, beginning one level above the leaves
of a complete (M + 2)-ary tree and going up to the root.
The reasoning behind the strategy is the following:
If the adversary deletes a node of degree M + 3 in
a tree, this ensures that a degree increase of at least
1 is passed to its children. What the adversary must
do is to ensure that log n of these degree increases
are credited to the same node.
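
The deletion schedule of LevelAttack can be sketched as below. The sketch covers only the order in which internal nodes are deleted on a complete (M+2)-ary tree and the resulting tree size; it deliberately omits the Prune step, which depends on how the healing algorithm responded to earlier deletions. The breadth-first node numbering and the function names are assumptions made for the example.

```python
def complete_kary_levels(k, depth):
    """Node ids of a complete k-ary tree grouped by level (root = level 0),
    numbered breadth-first starting from 0."""
    levels, next_id = [], 0
    for level in range(depth + 1):
        width = k ** level
        levels.append(list(range(next_id, next_id + width)))
        next_id += width
    return levels


def level_attack_order(m, depth):
    """Deletion order: level D - 1 first, then upward, root last."""
    levels = complete_kary_levels(m + 2, depth)
    order = []
    for i in range(depth - 1, -1, -1):
        order.extend(levels[i])
    return order


def tree_size(m, depth):
    """Number of nodes n in a complete (m + 2)-ary tree of the given depth."""
    k = m + 2
    return (k ** (depth + 1) - 1) // (k - 1)


order_small = level_attack_order(1, 2)   # ternary tree of depth 2
n_small = tree_size(1, 2)
```

Because n grows as (M+2)^D for fixed M, the depth D is logarithmic in n, which is where the log n in the lower bound comes from.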

Lemma 13. Assume an (M + 2)-ary tree T, a degree-bounded
locality-aware healing algorithm and the
LevelAttack adversarial strategy. Then, when
LevelAttack deletes a node at level i, 0 ≤ i < D, some leaf
node of the original tree increases its degree by at least
D − i.

Proof. The proof is by induction.

Base case: In the LevelAttack strategy, the nodes
at level D − 1 are deleted first. Thus, a deletion of a
node at level D − 1 is our base case. A node at level D − 1
has M + 3 neighbors. By Lemma 12, there is at least one
leaf node that increases its degree by 1 or more. Thus,
the base case holds.

Inductive step: Assume the hypothesis holds for
nodes at level i + 1. We now show that it holds for nodes
at level i. Consider a node, say X, at level i ≥ 0. It had
M + 2 children at level i + 1. By the inductive hypothesis,
each of these deletions led to at least one node with
degree increase at least D − (i + 1). Moreover, X is not
among these M + 2 nodes, and all of them are now neighbors
of X, since X itself was involved in each of these deletions.
The Prune operation in step 5 retains only these
M + 2 nodes as children of X. Each of these children has
degree increase at least D − (i + 1) and was originally a
leaf node of T. The adversary now deletes X. By Lemma 12, at
least one of these children incurs a further degree increase,
so its total degree increase is at least D − i.

□ 

Theorem 2. Consider any locality-aware algorithm 
that increases the degree of any node after an attack by 
at most a fixed constant. Then there exists a graph and a 
strategy of deletions on that graph that will force the al- 
gorithm to increase the degree of some node by at least 
log n. 

Proof. It is sufficient to give a graph and an attack
strategy such that any degree-bounded locality-aware healing
algorithm will have to increase a particular node's
degree by log n. Let M be the constant degree increase
that is the maximum that the healing algorithm can
impose on any one node in the graph. Then, for a graph
which is a full (M+2)-ary tree (Figure 7), the adversary
uses LevelAttack.

Consider an (M+2)-ary tree T of depth D with levels
numbered 0 to D. By Lemma 13, after the last deletion in
the adversary strategy, which is the deletion of the root
of T, i.e. the node at level 0, there is at least one node left
which has a degree increase of D. Since D is Θ(log n) for
fixed M, this adversary strategy achieves a degree increase of
Ω(log n).

□ 
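
The relation between the depth D and the number of nodes n used in the proof can be made explicit. The following is a short worked derivation, assuming the tree is full (every internal node has exactly M + 2 children) and that M is a constant:

```latex
\[
n \;=\; \sum_{i=0}^{D} (M+2)^{i} \;=\; \frac{(M+2)^{D+1}-1}{M+1},
\qquad\text{hence}\qquad
(M+2)^{D} \;\le\; n \;<\; (M+2)^{D+1},
\]
\[
D \;=\; \left\lfloor \log_{M+2} n \right\rfloor
  \;=\; \left\lfloor \frac{\log n}{\log (M+2)} \right\rfloor
  \;=\; \Theta(\log n).
\]
```

A degree increase of D on some node therefore grows logarithmically in n, with a constant factor of 1/log(M + 2).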

4. Experiments 

We carried out a number of experiments to ascertain 
the performance of various healing algorithms. We used 
a number of attack strategies to measure how different 
healing strategies performed with regard to degree in- 
crease and stretch, where stretch is the maximum ratio 
of distance increase in the healed network compared to 
the original network, over all pairs of nodes. Our em- 
pirical results on stretch and a heuristic for maintaining 
low stretch are described in Section 4.6. 

4.1. Methodology 

Most of our experiments were conducted on random
graphs. These graphs were generated by the preferential
attachment model proposed by Barabási [3, 4]. The
experimental approach was the following:

• For each graph size, for a particular deletion and 
healing strategy, repeat for 30 random instances of 
the graph: 

- Repeat while there are nodes in the graph: 

* delete a single node according to the 
deletion strategy. 

* repair according to the self-healing
strategy.

* measure the statistics (e.g. maximum 
change of degree for any node) for the 
graph. 

• average the statistics for each graph size. 
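
The procedure above can be sketched in Python as follows. This is a minimal illustration, not the code used for the reported experiments: the function names (`preferential_attachment`, `run_trial`, `average_max_increase`), the attachment parameter `m`, and the seeding scheme are all assumptions made for the example. The healing step shown is the simple Graph heal of Section 4.3 with neighbors ordered by label.

```python
import random

def preferential_attachment(n, m=2, seed=0):
    """Grow a graph on nodes 0..n-1 (requires n > m). Each new node attaches
    to m distinct existing nodes chosen with probability proportional to degree."""
    rng = random.Random(seed)
    adj = {u: set() for u in range(n)}
    endpoints = []  # every node appears once per incident edge

    def link(a, b):
        adj[a].add(b)
        adj[b].add(a)
        endpoints.extend((a, b))

    for a in range(m + 1):              # seed graph: clique on m + 1 nodes
        for b in range(a + 1, m + 1):
            link(a, b)
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for t in sorted(targets):
            link(new, t)
    return adj

def max_node(adj):
    """Deletion strategy: pick the node of maximum degree (smallest label on ties)."""
    return min(adj, key=lambda u: (-len(adj[u]), u))

def graph_heal(adj, neighbors):
    """Healing strategy: reconnect the former neighbors as a complete binary tree."""
    order = sorted(neighbors)
    for i in range(1, len(order)):
        parent, child = order[(i - 1) // 2], order[i]
        adj[parent].add(child)
        adj[child].add(parent)

def run_trial(adj, choose_victim, heal):
    """Delete and heal until the graph is empty; return the largest degree
    increase (relative to the initial degree) seen on any node."""
    initial = {u: len(nb) for u, nb in adj.items()}
    worst = 0
    while adj:
        v = choose_victim(adj)
        neighbors = adj.pop(v)
        for u in neighbors:
            adj[u].discard(v)
        heal(adj, neighbors)
        for u, nb in adj.items():
            worst = max(worst, len(nb) - initial[u])
    return worst

def average_max_increase(n, instances=30):
    """Average the statistic over several random instances of one graph size."""
    results = [run_trial(preferential_attachment(n, seed=s), max_node, graph_heal)
               for s in range(instances)]
    return sum(results) / len(results)
```

Passing a different `choose_victim` or `heal` function to `run_trial` swaps the deletion or healing strategy without changing the loop.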

4.2. Attack Strategies 

The aim of the adversary is to collapse the network
by overloading a node beyond its maximum capacity.
There are many possible attack strategies. One
strategy is to delete the node with the maximum degree.
We call this the MaxNode strategy. A strategy that
places additional burden on an already heavily burdened
node would also seem effective: the adversary
continuously deletes a randomly chosen neighbor of the
highest degree node in the network. We call this the
NeighborOfMax strategy (NMS). This strategy is also
plausible in practice, since in a real network the hubs,
i.e. the high degree nodes, are likely to be better
protected and more resilient to attack, while their less
significant neighbors should be easier to take down.
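
The two attack strategies can be expressed as victim-selection functions. The sketch below assumes the graph is stored as a dictionary mapping each node to the set of its neighbors; the function names, the tie-breaking rule, and the handling of an isolated hub are assumptions made for illustration.

```python
import random

def max_node(adj):
    """MaxNode: choose the node with maximum current degree.
    Ties are broken by smallest label (an arbitrary choice for this sketch)."""
    return min(adj, key=lambda u: (-len(adj[u]), u))

def neighbor_of_max(adj, rng=random):
    """NeighborOfMax (NMS): choose a random neighbor of the maximum degree node.
    If that node has no neighbors left, fall back to deleting the node itself."""
    hub = max_node(adj)
    if not adj[hub]:
        return hub
    return rng.choice(sorted(adj[hub]))
```

For a star with center 0 and leaves 1, 2 and 3, `max_node` selects the center, while `neighbor_of_max` selects one of the leaves and so keeps loading the hub with healing edges.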

4.3. Healing strategies 

We attempted various locality-aware healing strate- 
gies, some of which are the following: 

• Graph heal: On each deletion, we reconnect the 
neighbors of the deleted node in a binary tree re- 
gardless of whether we introduced any cycles in the 
graph formed by the new edges introduced for heal- 
ing. This seems to be a naive algorithm since the 
nodes use more edges than what are required for 
maintaining connectivity. 

• Binary tree heal: On each deletion, we reconnect 
the neighbors of the deleted node in a binary tree 
being careful not to introduce any cycles in the 
graph formed by the new edges introduced for heal- 
ing. This is done using random IDs which can then 
be used to identify which tree a particular node be- 
longs to. This is an improvement on the previous 
algorithm but still naive since it does not take into 
consideration the previous degree increase suffered 
by nodes during healing. 

• DASH (degree based binary tree heal): DASH is
smarter than the previous algorithms, as borne out
by the results of the experiments. The DASH
algorithm is described in Section 2.1 and
stated as Algorithm 1.

• SDASH (surrogate degree based binary tree heal):
A heuristic based on DASH that tries to keep both
node degrees and path lengths small (described in
Section 4.6.2).
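
The difference between an arbitrary binary tree and a degree based one can be illustrated with a short sketch. It is a simplification: it omits the component ID bookkeeping that Binary tree heal and DASH use to avoid cycles, and the names `binary_tree_edges`, `degree_ordered_heal` and `delta` (a map from node to its degree increase so far) are hypothetical.

```python
def binary_tree_edges(ordered):
    """Map nodes onto a complete binary tree, left to right and top down:
    the node at position i becomes a child of the node at position (i - 1) // 2."""
    return [(ordered[(i - 1) // 2], ordered[i]) for i in range(1, len(ordered))]

def degree_ordered_heal(adj, neighbors, delta):
    """Reconnect neighbors in increasing order of degree increase, so that nodes
    that have already absorbed many healing edges end up as leaves."""
    ordered = sorted(neighbors, key=lambda u: (delta[u], u))
    edges = binary_tree_edges(ordered)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return edges
```

In a complete binary tree an internal node receives up to three new edges while a leaf receives only one, which is why placing the nodes with the largest previous degree increase last limits the maximum degree increase.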

4.4. Degree increase 

The NeighborOfMax strategy consistently resulted
in higher degree increase; hence, we report results
only for this attack strategy. Our experimental results
clearly show that DASH and SDASH are good healing
strategies: both performed well against both adversary
strategies. Figure 8 shows that DASH and SDASH have
much lower degree increase than the other, more naive
strategies. Also, this degree increase was less than log n,
which is consistent with our theoretical results. SDASH
has the additional nice property that it keeps path lengths
small over multiple adversarial deletions.

Figure 8. Maximum degree increase:
DASH vs other algorithms



4.5. Messages 



Figure 9(a) shows that the number of times a node's
ID changes is less than log n, as expected, for all
healing strategies. Figure 9(b) shows the maximum number
of messages a node sent out for the different strategies.
Note that the number of messages a node sends out is
at most the number of times the node changes ID
multiplied by the degree of the node. Thus, algorithms
with higher degree increase perform poorly.



4.6. Heuristics and experiments involving Stretch



4.6.1 Stretch 

Stretch is an important property we would also like our 
self-healing algorithms to minimize. The stretch for any 
two nodes is the ratio between their distance in the new 
healed network and their distance in the original net- 
work. Stretch for the network is the maximum stretch 
over all pairs of nodes. Stretch is also closely related 
to the diameter of the network. In some sense, maintain- 
ing low degree increase and low stretch are contradictory 
aims since a high-degree node will lead to shorter paths 
and possibly lower stretch in the network. 
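
The definition of stretch can be computed directly with breadth-first search. The sketch below assumes unweighted graphs stored as dictionaries from node to neighbor set, and measures stretch only over pairs of nodes that survive in the healed network; the names `bfs_distances` and `network_stretch` are chosen for this example.

```python
from collections import deque

def bfs_distances(adj, src):
    """Hop distances from src to every node reachable from it."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def network_stretch(original, healed):
    """Maximum, over pairs of surviving nodes, of the healed distance divided
    by the original distance. Pairs disconnected by healing count as infinite."""
    worst = 0.0
    for s in healed:
        d_old = bfs_distances(original, s)
        d_new = bfs_distances(healed, s)
        for t in healed:
            if t == s or t not in d_old:
                continue  # skip self-pairs and pairs never connected originally
            ratio = d_new[t] / d_old[t] if t in d_new else float("inf")
            worst = max(worst, ratio)
    return worst
```

This runs one search per surviving node in each graph, which is adequate for graphs of experimental size.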

Figure 9. ID changes and messages exchanged per node.
(b) Number of messages exchanged for Component(ID)
information maintenance



4.6.2 SDASH: a strategy with good empirical results

SDASH is an algorithm we have devised which
empirically has both low degree increase and low stretch.
During self-healing, we say a node surrogates if it
replaces its deleted neighbor in the network, i.e. it takes
over all the connections of the deleted neighbor.
Surrogation never increases stretch, since no path
increases in length. In certain situations, surrogation
can be done without raising the largest degree increase
among the nodes involved (the test in step 5 of
Algorithm 3). In such situations, SDASH performs
surrogation; otherwise it simply applies DASH. SDASH
is described in Algorithm 3.



Algorithm 3 SDASH: Surrogate Degree-Based Self-Healing

1: Init: for given network G(V, E), initialize each vertex
   with a random number ID in [0, 1], selected
   uniformly at random.
2: while true do
3:   If a vertex v is deleted, do
4:     Let m ∈ UN(v, G) ∪ N(v, G') be the node with
       maximum degree increase (δ) of all nodes
       in UN(v, G) ∪ N(v, G').
5:     if w ∈ UN(v, G) ∪ N(v, G') and
       δ(w) + |UN(v, G) ∪ N(v, G')| - 1 < δ(m) then
6:       connect all nodes in UN(v, G) ∪ N(v, G') to w.
7:     else
8:       Nodes in UN(v, G) ∪ N(v, G') are reconnected
         into a complete binary tree. To connect
         the tree, go left to right, top down, mapping
         nodes to the complete binary tree in increasing
         order of δ value.
9:     end if
10:    Let MINID be the minimum ID of any node
       in UN(v, G) ∪ N(v, G'). Propagate MINID to
       all the nodes in the tree of UN(v, G) ∪ N(v, G')
       in G'. All these nodes now set their ID to
       MINID.
11: end while



As can be seen in the figures that follow, SDASH
empirically appears to keep degree increase within
O(log n) and stretch within O(log n). We are working
on proving theoretical properties of this algorithm.
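
Steps 4 to 9 of Algorithm 3 can be sketched in Python as follows. The sketch assumes the reconnection set UN(v, G) ∪ N(v, G') has already been computed and is passed in as `R`, and that `delta` maps each node to its degree increase so far; both names are hypothetical. Algorithm 3 does not say which w to use when several satisfy the test in step 5, so picking the one with the smallest δ is an assumption of this example. ID propagation (step 10) is omitted.

```python
def sdash_reconnect(R, delta):
    """Return the healing edges for reconnection set R.
    Surrogate if some node w satisfies delta[w] + |R| - 1 < delta[m], where m
    has the maximum degree increase in R; otherwise build the DASH binary tree."""
    nodes = sorted(R, key=lambda u: (delta[u], u))  # increasing order of delta
    if len(nodes) < 2:
        return []
    m = nodes[-1]  # node with maximum degree increase (step 4)
    for w in nodes:
        if delta[w] + len(nodes) - 1 < delta[m]:
            # Step 6: surrogation, every other node connects to w.
            return [(w, u) for u in nodes if u != w]
    # Step 8: complete binary tree, filled left to right and top down.
    return [(nodes[(i - 1) // 2], nodes[i]) for i in range(1, len(nodes))]
```

With degree increases 0, 1, 2 and 9 on four nodes, the first node can surrogate because 0 + 4 - 1 < 9, so the result is a star rather than a binary tree. When all degree increases are equal, the test fails and the binary tree is used.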



4.6.3 Stretch: empirical results 

Figure 10 shows the performance of some of our
algorithms for stretch. We determined that the
MaxNode strategy is most effective for the adversary
when trying to maximize stretch, and so our results in
Figure 10 are against that adversarial strategy. The more
naive healing strategies do a good job of
minimizing stretch. However, it is important to keep in
mind that these more naive algorithms increase the node
degrees to a point where they are unlikely to be
useful for many applications. In contrast, our experiments
show that SDASH does a good job of minimizing both
stretch and degree increase.

Figure 10. Stretch for various algorithms



5. Conclusions and future work 

We have studied the problem of self-healing in net- 
works that are reconfigurable in the sense that new 
edges can be added to the network. We have described 
DASH, a simple, efficient and localized algorithm for 
self-healing, that provably maintains network
connectivity, even while increasing the degree of any node by
no more than O(log n). We have shown that DASH is
asymptotically optimal in terms of minimizing the de- 
gree increase of any node. Further, we have presented 
empirical results on power-law networks showing that 
DASH significantly outperforms the naive algorithms 
for this problem. 

Several interesting problems remain open including 
the following: Can we not only maintain connectivity, 
but also provably ensure that lengths of shortest paths in 
the graph do not increase by too much? Can we remove 
the need for propagating IDs in order to maintain con- 
nected component information, or is such information 
strictly necessary to keep the degree increase small? 
Can we use the self-healing idea to protect invariants 
for combinatorial objects besides graphs? For example, 
can we provide algorithms to rewire a circuit so that it 
maintains essential functionality even when multiple 
gates fail? 